Activity Mining in Open Source Software

Authors

  • Daniel Mack
  • Nitesh V. Chawla
  • Greg Madey
Abstract

An open source software repository is a collection of projects with varying levels of activity, participation, and downloads. In this paper, we attempt to categorize and mine activity, and thereby discover success, by focusing on the Games group. However, we observe that a significant number of projects within the Games category are inactive or have no ranking assigned by SourceForge. We therefore first segmented our dataset based solely on the ranking or activity assigned. We then constructed a set of characteristics, such as number of developers, number of posters, and number of downloads, to identify associations, activity, and relationships in the smaller active set. We used regression rules and association rules to discover these associations and relationships.

Contact: Prof. Nitesh Chawla, Dept. of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556. Tel: 1-574-631-3637. Fax: 1-574-631-9260. Email: [email protected]
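As a rough illustration of the pipeline the abstract describes (drop unranked projects, derive per-project characteristics, then mine association rules), the following Python sketch may help. It is not the authors' implementation: the records, field names, thresholds, and the helper names `active_only`, `to_items`, and `mine_rules` are all hypothetical, and it covers only simple pairwise association rules, not the regression rules also mentioned.

```python
from itertools import combinations

# Hypothetical toy records; field names and values are illustrative only.
PROJECTS = [
    {"rank": 12, "developers": 8, "posters": 40, "downloads": 9000},
    {"rank": 30, "developers": 6, "posters": 25, "downloads": 7000},
    {"rank": None, "developers": 1, "posters": 0, "downloads": 3},
    {"rank": 450, "developers": 1, "posters": 2, "downloads": 50},
    {"rank": 77, "developers": 5, "posters": 1, "downloads": 6000},
    {"rank": 900, "developers": 2, "posters": 0, "downloads": 20},
]

# Assumed cut-offs for turning raw counts into "high"/"low" items.
THRESHOLDS = {"developers": 4, "posters": 10, "downloads": 1000}


def active_only(projects):
    """Keep only projects that have a ranking assigned."""
    return [p for p in projects if p["rank"] is not None]


def to_items(project):
    """Discretize one project into a set of attribute=level items."""
    return {
        f"{name}={'high' if project[name] >= cutoff else 'low'}"
        for name, cutoff in THRESHOLDS.items()
    }


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    return sum(1 for r in records if itemset <= r) / len(records)


def mine_rules(records, min_support=0.4, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    items = sorted(set().union(*records))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, records)
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, records)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


records = [to_items(p) for p in active_only(PROJECTS)]
for lhs, rhs, sup, conf in mine_rules(records):
    print(f"{lhs} -> {rhs}  support={sup:.2f}  confidence={conf:.2f}")
```

Segmenting before mining keeps the unranked projects from dominating the item counts, which is the motivation the abstract gives for working on the smaller active set.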

Similar resources

Antecedents of open source software defects: A data mining approach to model formulation, validation and testing

This paper develops, tests, and validates a model for the antecedents of open source software (OSS) defects, using Data and Text Mining. The public archives of OSS projects are used to access historical data on over 5,000 active and mature OSS projects. Using domain knowledge and exploratory analysis, a wide range of variables is identified from the process, product, resource, and end-user charac...

Data Mining User Activity in Free and Open Source Software (FOSS)/ Open Learning Management Systems

Free and Open Source Software (FOSS)/Open Educational Systems development projects abound in higher education today. Many universities worldwide have adopted open source software like ATutor and Moodle as an alternative to commercial or homegrown systems. The move to open source learning management systems entails many special considerations, including usage analysis facilities. The tracking of...

Data Mining for Software Defect Prediction with Open Source and Commercial Softwares

Open source data mining software has been emerging in recent times. In this work I carry out an empirical study to compare the performance of open source and commercial data mining software. The results clearly reveal that, both before and after data preprocessing of the two NASA datasets used in this work, commercial data mining software (MATLAB) outperforms open source data ...

A Survey of Open Source Data Mining Systems

Open source data mining software represents a new trend in data mining research, education and industrial applications, especially in small and medium enterprises (SMEs). With open source software an enterprise can easily initiate a data mining project using the most current technology. Often the software is available at no cost, allowing the enterprise to instead focus on ensuring their staff ...

Investigating Open Source Project Success: A Data Mining Approach to Model Formulation, Validation and Testing

This paper demonstrates the use of Data Mining (DM) techniques in exploratory research. A robust model for identifying the factors that explain the success of Open Source Software (OSS) projects is created, validated and tested. The predictive modeling techniques of Logistic Regression (LR), Decision Trees (DT) and Neural Networks (NN) are used together in this analysis. Using Text Mining resul...

Development of a goal programming model for optimization of truck allocation in open pit mines

Truck and shovel operations comprise approximately 60% of the total operating costs in open pit mines. To increase productivity and reduce the cost of mining operations, it is essential to manage the equipment with high efficiency. In this work, the chance-constrained goal programming (CCGP) model presented by Michalakopoulos and Panagiotou is developed to determine an optimal truc...

Publication date: 2005